| sno_name | obsnbtot | varnbtot | yearsrange | nbstatmt | nbstattot | nbstatyr |
|---|---|---|---|---|---|---|
| sl_ctd | 75,248 | 31 | 1994 to 2023 | Jan, Feb, Mar, Apr, May, Jun, Jul | 18 | 8 |
| sl_hydro | 17,271 | 50 | 1993 to 2023 | Jan, Feb, Mar, Apr, May, Jun, Jul | 21 | 14 |
| sl_piconano | 392 | 72 | 2021 to 2022 | May, Jun, Jul, Sep, Oct | 17 | 14 |
NEO Workshop
1 Introduction
1.0.1 Home made functions
The aim of the workshop is to…
2 SNOs datasets
All data treatment has been conducted with R version 4.2.2 (2022-10-31 ucrt) (details are available in Supplementary Data). Significance levels are tagged with “****” for p < .0001, “***” for p < .001, “**” for p < .01 and “*” for p < .05. A quick report of each data set, generated with the DataExplorer package, is available in the docs/ folder.
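The tagging rule for significance levels can be illustrated with a minimal sketch. It is written in Python for illustration only (the report's own processing was done in R), and the function name `significance_tag` is hypothetical. Returning an empty string for p ≥ .05 is an assumption, since the text names no tag for that case.

```python
def significance_tag(p: float) -> str:
    """Return the star tag for a p-value, using the thresholds stated in the text."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"p-value must be within [0, 1], got {p!r}")
    # Thresholds are checked from the strictest to the loosest,
    # so the first match is the highest significance level reached.
    thresholds = ((0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*"))
    for threshold, tag in thresholds:
        if p < threshold:
            return tag
    # Assumption: no tag at all when p >= .05.
    return ""


if __name__ == "__main__":
    for p in (0.00005, 0.0005, 0.005, 0.03, 0.2):
        print(p, repr(significance_tag(p)))
```

The comparisons are strict, so a p-value exactly equal to a threshold falls into the next, weaker level.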
2.1 BENTHOBS
Data are available at https://data.benthobs.fr/. The following files are provided:
- granulometry: TSV file with granulometry data.
- macrofauna: TSV file with macrofauna data.
- organicmatter: TSV file with organic matter data.
The BENTHOBS data set comprises 3 tables that share the following common variables:
bdd_base, bdd_table, Laboratory, Survey, station, latitude, longitude, sampling_date, Data_source, dfMonth, dfMonthtxt, dfYear, dfQuarter, column_label, Int_Area_S, Int_Area_M, Int_Area_L
- bo_granu (n= 1,580) contains 30 variables. The period covered is 1977 to 2021; sampling takes place mainly in March, April, October and May, at 9 different stations, with a mean of 5 per year.
- bo_macro (n= 42,556) contains 39 variables. The period covered is 1997 to 2021; sampling takes place mainly in March, September and October, at 18 different stations, with a mean of 6 per year. The density_ind_m2 (ind.m-2) was calculated by dividing the Count (ind) by the Sample size.
- bo_orga (n= 253) contains 22 variables. The period covered is 2004 to 2020; sampling takes place mainly in March, April and October, at 7 different stations, with a mean of 4 per year.
| sno_name | obsnbtot | varnbtot | yearsrange | nbstatmt | nbstattot | nbstatyr |
|---|---|---|---|---|---|---|
| bo_granu | 1,580 | 30 | 1977 to 2021 | Mar, Apr, Oct, May | 9 | 5 |
| bo_macro | 42,556 | 39 | 1997 to 2021 | Mar, Sep, Oct | 18 | 6 |
| bo_orga | 253 | 22 | 2004 to 2020 | Mar, Apr, Oct | 7 | 4 |
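The density calculation described for bo_macro (Count divided by Sample size) can be written as a one-line function. This is a minimal sketch in Python for illustration only (the report's own processing was done in R). The function and parameter names are hypothetical, and it assumes that the sample size is expressed in m², which the text implies through the ind.m-2 unit but does not state.

```python
def density_ind_m2(count_ind: float, sample_size_m2: float) -> float:
    """Density in individuals per square metre: count divided by sampled area."""
    if sample_size_m2 <= 0:
        # A zero or negative sampled area cannot yield a meaningful density.
        raise ValueError("sample size must be strictly positive")
    return count_ind / sample_size_m2


if __name__ == "__main__":
    # Hypothetical record: 30 individuals counted in a 0.25 m2 sample.
    print(density_ind_m2(30, 0.25))  # 120.0
```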
2.2 PHYTOBS
Data are available at https://data.phytobs.fr/ (PHYTOBS 2021). The following files are provided:
- Analyst files, containing single taxon counts.
- Phytobs files, containing single counts for taxon groups that are part of the SNO labelled taxon groups.
- Combined files, aggregating the two previous tables.

Only the Phytobs table was extracted from the PHYTOBS data set.
- po_phyto (n= 170,776) contains 53 variables. The period covered is 1987 to 2019; sampling takes place mainly from January to July, at 23 different stations, with a mean of 14 per year.
| sno_name | obsnbtot | varnbtot | yearsrange | nbstatmt | nbstattot | nbstatyr |
|---|---|---|---|---|---|---|
| po_phyto | 170,776 | 53 | 1987 to 2019 | Jan, Feb, Mar, Apr, May, Jun, Jul | 23 | 14 |
2.3 SOMLIT
Data are available at https://www.somlit.fr/demande-de-donnees/. Each available file has to be requested by e-mail. Please refer to (Liénart et al. 2017), (Liénart et al. 2018), (Cocquempot et al. 2019) and (Lheureux et al. 2022) for details about how the data set was built and its history. Information on the data set: https://www.ir-ilico.fr/?SOMLIT_InteretScientifique
Available parameters are shown in ?@fig-parsom. The SOMLIT data set comprises 3 tables that share the following common variables:
bdd_base, bdd_table, longitude, latitude, station, dfYear, dfQuarter, dfMonth, dfMonthtxt, sampling_date, ID_SITE, column_label, Int_Area_S, Int_Area_M, Int_Area_L
- sl_ctd (n= 75,248) contains 31 variables. The period covered is 1994 to 2023; sampling takes place mainly from January to July, at 18 different stations, with a mean of 8 per year. The table was modified by splitting the depth information (PROFONDEUR) into 25 levels of ~10 m, then summarising by day and depth level (median md_d; ~min mi_d: 0.01 quantile; ~max mx_d: 0.99 quantile).
- sl_hydro (n= 17,271) contains 50 variables. The period covered is 1993 to 2023; sampling takes place mainly from January to July, at 21 different stations, with a mean of 14 per year. The COEF_MAREE values at 0 were replaced by NA, as were the MAREE values at "inc".
- sl_piconano (n= 392) contains 72 variables. The period covered is 2021 to 2022; sampling takes place mainly in May, June, July, September and October, at 17 different stations, with a mean of 14 per year. The COEF_MAREE values at 0 were replaced by NA, as were the MAREE values at "inc".
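The sl_ctd summarisation (depth split into 25 levels of ~10 m, then median and 0.01/0.99 quantiles per day and depth level) can be sketched as follows. This is an illustrative Python sketch only (the report's own processing was done in R), and all function names are hypothetical. Three points are assumptions: levels are numbered from 1 at the surface, depths beyond 250 m fall into the last level, and quantiles use linear interpolation between order statistics (the default method of R's `quantile`).

```python
import math
from collections import defaultdict
from datetime import date
from statistics import median

LEVEL_HEIGHT_M = 10.0  # each depth level spans ~10 m
N_LEVELS = 25          # number of levels stated in the text


def depth_level(depth_m: float) -> int:
    """Map a depth in metres to a level number in 1..N_LEVELS."""
    level = int(depth_m // LEVEL_HEIGHT_M) + 1
    # Assumption: out-of-range depths are clamped to the first or last level.
    return min(max(level, 1), N_LEVELS)


def quantile(values, q: float) -> float:
    """Quantile by linear interpolation between order statistics."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = math.floor(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def summarise_ctd(records):
    """Summarise (day, depth_m, value) records per (day, depth level)."""
    groups = defaultdict(list)
    for day, depth_m, value in records:
        if value is not None:  # missing values are skipped
            groups[(day, depth_level(depth_m))].append(value)
    return {
        key: {
            "md_d": median(vals),          # median
            "mi_d": quantile(vals, 0.01),  # ~min: 0.01 quantile
            "mx_d": quantile(vals, 0.99),  # ~max: 0.99 quantile
        }
        for key, vals in sorted(groups.items())
    }


if __name__ == "__main__":
    day = date(2020, 5, 4)
    sample = [(day, 2.0, 10.0), (day, 5.0, 12.0), (day, 8.0, 14.0), (day, 12.0, 7.0)]
    for key, stats in summarise_ctd(sample).items():
        print(key, stats)
```

Using the 0.01 and 0.99 quantiles rather than the strict minimum and maximum makes the daily bounds less sensitive to isolated outliers in a profile.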
2.4 COASTHF
Data are available at https://data.coriolis-cotier.org/fr. In the menu, the active platform toggle button is activated and the COASTHF network is selected. All available stations were selected. Detailed information is available at https://coast-hf.fr/. The selected buoys are listed in Table 4. Among the variables, fields with the “_ADJUSTED” suffix hold data that have been validated (qualified); the other fields hold raw data (only aberrant values removed). The QC columns characterise data quality with a score from 1 (good) to 9 (bad) for each column, including date, station… Variables with the LEVEL0 or LEVEL-1 suffix are atmospheric, LEVEL1 variables are near the surface (2-3 m deep) and LEVEL3 variables are at the bottom. Data are summarised per day to reduce volume and processing time, with a median (x.md_d), max (x.mx_d) and min (x.mi_d) per day.
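The daily summary of the high-frequency buoy data (median, max and min per day, with the x.md_d, x.mx_d and x.mi_d naming) can be sketched as follows. This is a Python illustration only (the report's own processing was done in R). The function name `daily_summary` and the variable name `TEMP_LEVEL1` are hypothetical, and skipping missing readings is an assumption.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def daily_summary(name: str, readings):
    """Summarise (timestamp, value) readings of one variable per calendar day.

    Returns {day: {"<name>.md_d": median, "<name>.mx_d": max, "<name>.mi_d": min}}.
    """
    by_day = defaultdict(list)
    for timestamp, value in readings:
        if value is not None:  # assumption: missing readings are ignored
            by_day[timestamp.date()].append(value)
    return {
        day: {
            f"{name}.md_d": median(values),
            f"{name}.mx_d": max(values),
            f"{name}.mi_d": min(values),
        }
        for day, values in sorted(by_day.items())
    }


if __name__ == "__main__":
    # Hypothetical high-frequency readings for one buoy variable.
    readings = [
        (datetime(2022, 6, 1, 0, 0), 14.2),
        (datetime(2022, 6, 1, 12, 0), 15.0),
        (datetime(2022, 6, 1, 18, 0), 14.6),
        (datetime(2022, 6, 2, 6, 0), 14.9),
    ]
    for day, stats in daily_summary("TEMP_LEVEL1", readings).items():
        print(day, stats)
```

Collapsing each day to three values keeps the range of the signal while reducing a sub-hourly series to one row per day.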
| Code | Name |
|---|---|
| EXIN0003 | POEM |
| EXIN0004 | SOLEMIO |
| EXIN0002 | EOL |
| EXIN0001 | ARCACHON B13 |
| 6100284 | Mesurho |
| EXIN0006 | SOLA |
| 6200021 | Vilaine Molit |
| IF000700 | SMART |
| 6200450 | Iroise Stanne |
| 6200310 | Smile LucSurMer |
| SCENES | SCENES |
| 6200443 | Carnot |
| EXIN0005 | ASTAN |
The COASTHF data set comprises 13 tables that share the following common variables:
bdd_base, bdd_table, PLATFORM, longitude, latitude, station, dfYear, dfQuarter, dfMonth, dfMonthtxt, sampling_date, column_label, Int_Area_S, Int_Area_M, Int_Area_L
| sno_name | obsnbtot | varnbtot | yearsrange | nbstatmt | nbstattot | nbstatyr |
|---|---|---|---|---|---|---|
| POEM | 993 | 66 | 2017 to 2022 | Jan, Feb, Mar, Apr, May, Jun, Jul | 1 | 1 |
| SOLEMIO | 1,223 | 48 | 2005 to 2022 | Jan, Feb, Mar, Apr, May, Jun, Jul | 1 | 1 |
| EOL | 2,883 | 27 | 2013 to 2022 | Jan, Feb, Mar, Apr, May, Jun, Jul | 1 | 1 |
| ARCACHON B13 | 1,298 | 24 | 2017 to 2022 | Jan, Feb, Mar, Apr, May, Jun, Jul | 1 | 1 |
| Mesurho | 4,476 | 558 | 2009 to 2023 | Jan, Feb, Mar, Apr, May, Jun, Jul | 1 | 1 |
| SOLA | 483 | 51 | 2021 to 2023 | Jan, Feb, Mar, Apr, May, Jun, Jul | 1 | 1 |
| Vilaine Molit | 4,141 | 156 | 2008 to 2022 | Jan, Feb, Mar, Apr, May, Jun, Jul | 1 | 1 |
| SMART | 2,533 | 36 | 2016 to 2023 | Jan, Feb, Mar, Apr, May, Jun, Jul | 1 | 1 |
| Iroise Stanne | 8,068 | 132 | 2000 to 2023 | Jan, Feb, Mar, Apr, May, Jun, Jul | 1 | 1 |
| Smile LucSurMer | 2,351 | 171 | 2015 to 2023 | Jan, Feb, Mar, Apr, May, Jun, Jul | 1 | 1 |
| SCENES | 1,309 | 81 | 2017 to 2023 | Jan, Feb, Mar, Apr, May, Jun, Jul | 1 | 1 |
| Carnot | 5,674 | 108 | 2004 to 2023 | Jan, Feb, Mar, Apr, May, Jun, Jul | 1 | 1 |
| ASTAN | 979 | 33 | 2019 to 2023 | Jan, Feb, Mar, Apr, May, Jun, Jul | 1 | 1 |
3 Global data sets description
Correction global rename of fields after run
3.1 Map of sites
3.2 HF data treatment to BF
Example of treatments: http://r-statistics.co/Time-Series-Analysis-With-R.html https://rc2e.com/timeseriesanalysis
sno_set holds the data as they are recorded. The _hf subset is a simple summary per day. The _mf subset is a monthly summary over the preceding year: for each month, it gives the median/min/max of the data over the one-year window that ends with the month before the record month (suffixes .md_y, .mi_y and .mx_y are added). The _bf subset is a quarterly summary over the preceding year, calculated in the same way.
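The "year before" window of the _mf subset can be made concrete with a small sketch. It is written in Python for illustration only (the report's own processing was done in R), and the function names are hypothetical. It assumes the window covers the 12 calendar months ending with the month before the record month, and that the input is a mapping from date to value (the text does not say whether raw records or daily summaries enter the window).

```python
from datetime import date
from statistics import median


def month_index(year: int, month: int) -> int:
    """Map a (year, month) pair to a running month count, so windows are plain ranges."""
    return year * 12 + (month - 1)


def trailing_year_summary(values_by_day, year: int, month: int):
    """Median/min/max over the 12 months that end just before (year, month).

    Returns None when no value falls inside the window.
    """
    end = month_index(year, month)  # the record month itself is excluded
    start = end - 12                # same month, one year earlier (included)
    window = [
        value
        for day, value in values_by_day.items()
        if start <= month_index(day.year, day.month) < end
    ]
    if not window:
        return None
    return {".md_y": median(window), ".mi_y": min(window), ".mx_y": max(window)}


if __name__ == "__main__":
    # Hypothetical values; the window for April 2022 is April 2021 to March 2022.
    values = {
        date(2021, 3, 15): 1.0,   # too old, excluded
        date(2021, 4, 1): 2.0,    # included
        date(2022, 3, 31): 4.0,   # included
        date(2022, 4, 1): 100.0,  # record month, excluded
    }
    print(trailing_year_summary(values, 2022, 4))
```

Under the same assumptions, the _bf variant would use a quarter index instead of a month index and a window of four quarters.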
3.3 SNOs timeline description
All records of all tables of all SNO are represented in Figure 2.
4 Timeline region focus
Data can then be distinguished by geographical area, at the different scales chosen. The large scale (Figure 3) emphasizes that the Channel, the Atlantic and north Brittany are of most interest. The small scale (Figure 4) is the preferred scale to reveal the challenge of the workshop. The intermediate scale (?@fig-medium_data) shows that even when SNO data sets are geographically close, they may not overlap well in time.
4.1 Small area time series
4.2 Medium area time series
5 Small areas of interest detailed analysis
5.1 Manche
5.1.1 Estuaire de la Liane
5.1.2 Cabourg
5.1.3 Luc sur Mer
5.2 Nord Bretagne
5.2.1 Rade de Camaret
5.2.2 Baie de Daoulas
5.2.3 La Rance
5.2.4 Hebihens
5.2.5 Baie de Morlaix
5.2.6 Riviere de Morlaix
5.3 Atlantique
5.3.1 Antioche
5.3.2 Embouchure de Gironde
5.3.3 Comprian
5.4 Mediterranee
5.4.1 Sola
6 Detailed exploration of data
Work in progress; for example, a ridge plot for each variable of a table.
Some correlation statistics are also included, with functions to build a correlation matrix with p values and colors, and a table of linear regression coefficients with significance symbols.
7 Final actions and save
Rdata are saved in different files in the ‘Matrices/’ folder:
- Matrices/NEO_wshp_bo, Matrices/NEO_wshp_po, Matrices/NEO_wshp_sl, Matrices/NEO_wshp_cf: each contains one SNO data set as extracted, without any modification
- NEO_wshp_sno_set_raw: contains all SNO data sets in one list, without any modification
- NEO_wshp_sno_set: contains all SNO data sets in one list, without the discarded tables
- NEO_wshp_Manche, NEO_wshp_Nord_Bretagne, NEO_wshp_Sud_Bretagne, NEO_wshp_Atlantique, NEO_wshp_Mediterranee: contain the data sets of all SNOs filtered by large area
- NEO_wshp_Small, NEO_wshp_Medium: contain the data sets of all SNOs filtered by small area (in a list by large and medium areas) and by medium area (in a list by large area), respectively
- NEO_wshp_plots*: contains the plots created in the script
- NEO_wshp: contains all other variables
8 Supplementary data
8.1 Software details
─ Session info ───────────────────────────────────────────────────────────────
setting value
version R version 4.2.2 (2022-10-31 ucrt)
os Windows 10 x64 (build 19045)
system x86_64, mingw32
ui RTerm
language EN
collate French_France.utf8
ctype French_France.utf8
tz Europe/Paris
date 2023-04-28
pandoc 2.19.2 @ C:/Program Files/RStudio/bin/quarto/bin/tools/ (via rmarkdown)
quarto 1.1.189 @ C:\\PROGRA~1\\RStudio\\bin\\quarto\\bin\\quarto.exe
─ Packages ───────────────────────────────────────────────────────────────────
package * version date (UTC) lib source
beepr * 1.3 2018-06-04 [1] CRAN (R 4.2.2)
boot * 1.3-28.1 2022-11-22 [1] CRAN (R 4.2.2)
cluster * 2.1.4 2022-08-22 [1] CRAN (R 4.2.2)
clustsig * 1.1 2014-01-15 [1] CRAN (R 4.2.2)
colorspace * 2.1-0 2023-01-23 [1] CRAN (R 4.2.3)
conflicted * 1.2.0 2023-02-01 [1] CRAN (R 4.2.3)
data.table * 1.14.8 2023-02-17 [1] CRAN (R 4.2.3)
dendextend * 1.17.1 2023-03-25 [1] CRAN (R 4.2.3)
devtools * 2.4.5 2022-10-11 [1] CRAN (R 4.2.2)
dplyr * 1.1.1 2023-03-22 [1] CRAN (R 4.2.3)
figpatch * 0.2 2022-05-03 [1] CRAN (R 4.2.2)
forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.2.3)
GGally * 2.1.2 2021-06-21 [1] CRAN (R 4.2.2)
ggforce * 0.4.1 2022-10-04 [1] CRAN (R 4.2.2)
ggplot2 * 3.4.1 2023-02-10 [1] CRAN (R 4.2.3)
ggpubr * 0.6.0 2023-02-10 [1] CRAN (R 4.2.3)
ggridges * 0.5.4 2022-09-26 [1] CRAN (R 4.2.2)
ggsci * 3.0.0 2023-03-08 [1] CRAN (R 4.2.3)
grafify * 3.0.1 2023-02-07 [1] CRAN (R 4.2.3)
Hmisc * 5.0-1 2023-03-08 [1] CRAN (R 4.2.3)
htmlwidgets * 1.6.2 2023-03-17 [1] CRAN (R 4.2.3)
introdataviz * 0.0.0.9003 2023-03-02 [1] Github (psyteachr/introdataviz@0519c98)
knitr * 1.42 2023-01-25 [1] CRAN (R 4.2.3)
labdsv * 2.0-1 2019-08-04 [1] CRAN (R 4.2.2)
lattice * 0.20-45 2021-09-22 [1] CRAN (R 4.2.2)
leaflet * 2.1.2 2023-03-10 [1] CRAN (R 4.2.3)
lmtest * 0.9-40 2022-03-21 [1] CRAN (R 4.2.2)
lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.2.3)
mgcv * 1.8-42 2023-03-02 [1] CRAN (R 4.2.3)
nlme * 3.1-162 2023-01-31 [1] CRAN (R 4.2.3)
openxlsx * 4.2.5.2 2023-02-06 [1] CRAN (R 4.2.3)
pairwiseAdonis * 0.4.1 2022-12-27 [1] Github (pmartinezarbizu/pairwiseAdonis@68468fe)
pastecs * 1.3.21 2018-03-15 [1] CRAN (R 4.2.2)
patchwork * 1.1.2 2022-08-19 [1] CRAN (R 4.2.2)
performance * 0.10.2 2023-01-12 [1] CRAN (R 4.2.3)
permute * 0.9-7 2022-01-27 [1] CRAN (R 4.2.2)
plot3D * 1.4 2021-05-22 [1] CRAN (R 4.2.2)
plotly * 4.10.1 2022-11-07 [1] CRAN (R 4.2.2)
pracma * 2.4.2 2022-09-22 [1] CRAN (R 4.2.2)
purrr * 1.0.1 2023-01-10 [1] CRAN (R 4.2.3)
quantreg * 5.94 2022-07-20 [1] CRAN (R 4.2.2)
RColorBrewer * 1.1-3 2022-04-03 [1] CRAN (R 4.2.0)
readr * 2.1.4 2023-02-10 [1] CRAN (R 4.2.3)
readxl * 1.4.2 2023-02-09 [1] CRAN (R 4.2.3)
reshape2 * 1.4.4 2020-04-09 [1] CRAN (R 4.2.2)
rlist * 0.4.6.2 2021-09-03 [1] CRAN (R 4.2.2)
rstatix * 0.7.2 2023-02-01 [1] CRAN (R 4.2.3)
scales * 1.2.1 2022-08-20 [1] CRAN (R 4.2.2)
sessioninfo * 1.2.2 2021-12-06 [1] CRAN (R 4.2.2)
sf * 1.0-12 2023-03-19 [1] CRAN (R 4.2.3)
sfheaders * 0.4.2 2023-03-03 [1] CRAN (R 4.2.3)
SparseM * 1.81 2021-02-18 [1] CRAN (R 4.2.0)
stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.2.2)
tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.2.3)
tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.2.3)
tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.2.3)
tmap * 3.3-3 2022-03-02 [1] CRAN (R 4.2.2)
tmaptools * 3.1-1 2021-01-19 [1] CRAN (R 4.2.2)
treemap * 2.4-3 2021-08-22 [1] CRAN (R 4.2.2)
usethis * 2.1.6 2022-05-25 [1] CRAN (R 4.2.2)
vegan * 2.6-4 2022-10-11 [1] CRAN (R 4.2.2)
wesanderson * 0.3.6 2018-04-20 [1] CRAN (R 4.2.2)
zoo * 1.8-11 2022-09-17 [1] CRAN (R 4.2.2)
[1] C:/Users/lehuen201/AppData/Local/Programs/R/R-4.2.2/library
──────────────────────────────────────────────────────────────────────────────